Illustrative Examples of the Development and Interpretation
of Hierarchical Tests in the Field
of Learning Disabilities*

Richard J. Hofmann
Miami University 

In the area of attitude measurement the Guttman (1950) scale has been
recognized and used as a model for many years. In the area of cognitive
assessment the Guttman scale has great potential that has not yet been
capitalized upon. To this end a new measurement model has been developed,
the hierarchical test. When several hierarchical tests are intercorrelated
they will have that elusive property referred to as "G". That is, when
factor analyzed they usually will have hierarchical loadings on a factor,
and if from the same content domain they will define just one factor. The
objectives of this manuscript are simply to discuss hierarchical tests and
to present several illustrative examples of their use with binary response
data, i.e., right-wrong, yes-no, etc.

The hierarchical test has several characteristics not found in traditional
assessment instruments. These characteristics are based upon the property
that on such a test the individual item response patterns of a large
majority of the responding individuals are highly predictable and orderly.
With regard to the measurement, identification, and understanding of certain
types of learning disabilities, the hierarchical test may provide insights
and new measurement approaches. It is possible that curriculum may be
developed hierarchically in terms of subordinate knowledge as determined
normatively by the item content of a test.

*This term is used in a very general fashion throughout this manuscript. 
The nature of hierarchical tests. If the items defining an assessment
instrument are similar to those portrayed by Figure 1 then they are all
measuring different levels of a single content domain and might be said to
define a hierarchical test. Such a test is composed of non-redundant binary
items measuring different levels of mastery within the same content domain.
That is, the item difficulty varies, as opposed to a traditional homogeneous
test assumed to be composed completely of items all measuring the same level
of mastery, a classic mastery test. One particularly compelling property of
a hierarchical test is that the total score (number of correct or positive
responses) that one obtains may be interpreted within a criterion-referenced
framework or meaningful "product framework", because a majority of the
response patterns will be orderly and well behaved. To the extent that the
test is a "perfect hierarchical test" the total score will actually define
without error the response pattern (the correct and incorrect responses to
each item), or processing, for each and every respondent. For example, if we
assume that we have a perfect six item hierarchical test and some individual
obtains a score of four correct responses, then this individual responded
correctly to the four easiest items. If we understand both the content and
construct validity of the domain associated with the test items we can make
interpretations of an individual's score directly in terms of specified
performance standards.

Clearly one will not usually have a perfect hierarchical test. For some
respondents we will have less than perfect prediction of their item
response pattern. This is to be expected just by chance; however, for some
respondents we will have extremely poor accuracy in predicting their response
patterns. Whereas a typical test would have one score, the composite score,
a hierarchical test will have two scores: a composite score determined by
the summation of correct responses, and an error score determined as the
number of responses incorrectly predicted for an individual when attempting
to predict their item response pattern given their composite score, or the
degree of "composite confusion". To the extent that an individual has a large
error score, his item response pattern and probably his cognitive processing
would be normatively atypical.
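
The two scores just described can be sketched in a few lines of Python. This
is only a minimal illustration, not the author's published procedure: the
function names predicted_pattern and error_score are hypothetical, and the
items are assumed to be listed from easiest to hardest, with 1 for a correct
response and 0 for an incorrect one.

```python
def predicted_pattern(score, n_items):
    """Pattern a perfect hierarchical test implies for a composite score:
    the `score` easiest items correct, the remaining items incorrect."""
    return [1] * score + [0] * (n_items - score)


def error_score(responses):
    """Number of responses that differ from the predicted pattern.

    `responses` is one subject's 0/1 list, easiest item first (assumed)."""
    score = sum(responses)  # composite score
    predicted = predicted_pattern(score, len(responses))
    return sum(1 for obs, pred in zip(responses, predicted) if obs != pred)


# Hypothetical subject: composite score 4, but items 3 and 6 are out of order.
subject = [1, 1, 0, 1, 0, 1]
composite = sum(subject)
errors = error_score(subject)  # 2 errors of prediction
```

Under this scheme every easy item missed is matched by a harder item passed,
so an individual's error score is always an even number.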

Example 1: Computational Example of Reproducibility, Error (Composite Confusion) 

Assume that a group of k tasks (items, responses and so on) have been
obtained. These items are ordered on the basis of empirical observation from
easiest to hardest. Assuming the items to be associated with a perfect
Guttman scale and assuming the items to be ordered from easiest, item 1, to
hardest, item k, then no subject with j correct responses will respond
correctly to any item m where m is more difficult than j. Following a similar
logic this same subject will respond correctly to any item i where i is as
easy or easier than j.



Within the framework of binary responses 1 is an affirmative response
and 0 is a negative response. With a perfect Guttman scale one would not
anticipate any pattern of the nature (01) for a two item easy-hard
sequencing. Such a pattern would be empirically illogical and disconfirming
of the empirically based easy-hard sequencing of the two items. Generalizing
this concept to k items, an index called reproducibility has been developed
to quantify how well the data conform to these assumptions. Reproducibility
is just the proportion of responses correctly predicted for a group of n
subjects on k tasks given their individual composite scores. The composite
score for each individual is just the number of correct responses made by
the individual, as previously noted.

In Table 1 an artificial response matrix is presented. There are ten
subjects and six items. The items have been ordered from left to right,
difficult to easy. The subjects have been ordered from top to bottom, highest
score to lowest score. Notice that all orderings are empirical or normative.
The item difficulties from left to right are .3, .4, .5, .6, .7 and .8.
Because there are tied composite scores the orderings within a score level
are arbitrary. How well can the total response patterns of all subjects be
reproduced? There are 10 subjects and six items, thus a total of 60 responses
to predict. A total of 14 errors of prediction were made, thus 46 responses
were correctly predicted, or 77 percent accuracy. The reproducibility of
the items is then .77. Alternatively, 23 percent error occurred. This is a
large percentage of error; most likely it is more than one would tolerate.
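
The reproducibility computation can be sketched as follows. This is a hedged
illustration only: Table 1 itself is not reproduced here, so the response
matrix below is hypothetical, the function names are assumptions, and the
items are assumed to run from easiest (left) to hardest (right), the reverse
of the Table 1 layout. Only the figures 14 errors and 60 responses come from
the text.

```python
def count_errors(responses):
    """Errors of prediction for one subject (items ordered easiest first)."""
    score = sum(responses)
    predicted = [1] * score + [0] * (len(responses) - score)
    return sum(obs != pred for obs, pred in zip(responses, predicted))


def reproducibility(matrix):
    """Proportion of all responses correctly predicted from composite scores."""
    total_responses = sum(len(row) for row in matrix)
    total_errors = sum(count_errors(row) for row in matrix)
    return 1 - total_errors / total_responses


# Figures quoted in the text: 14 errors out of 60 responses.
rep_from_text = 1 - 14 / 60  # rounds to .77

# Hypothetical 6-subject, 6-item matrix (not the data of Table 1).
hypothetical = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0],  # 2 errors
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # 2 errors
]
rep_hypothetical = reproducibility(hypothetical)
```

Because the index is a simple proportion, it can be recomputed after dropping
a suspect item or subject to see how much of the error that item or subject
contributes.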

Error might occur for any one of three reasons: (a) there may be a bad
item(s) in the test such as item 4; (b) there may be several (never more than
several) students for whom the item orderings are inapplicable; (c) the test
is just a poor test.
It is especially interesting to note that the reproducibility is nothing
more than the complement of the average percent error for each subject.
Inasmuch as the model of a hierarchical test is usually normatively based it
provides an opportunity to identify subjects who are normatively atypical.
To wit, subject A in Table 1 obtained a raw score of 3 on the test. The
average score is 3, thus one might not be concerned about a raw score of 3,
but note that subject A obtained an error score of 6 when the average error
score is 1.4. Clearly subject A is normatively atypical on this test and
worthy of additional investigation. Such additional investigation would of
course be initially based upon the content domain of the test. Certainly
to the extent that the test has a low reproducibility and a number of
subjects with errors of prediction such an interpretation is not warranted,
as the test does not conform well enough to the hierarchical model. The
virtue of a hierarchical test is minimal composite confusion or errors of
prediction. It seems that all tests purporting to use a composite score for
any type of decision or regression analysis should be free of or have at
least minimal composite confusion.
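
The link between the average error score and reproducibility, and the
flagging of an atypical subject, can be checked with a short sketch. The
individual error scores below are hypothetical; only their total (14), their
mean (1.4), the six items, and subject A's error score of 6 come from the
text. The cutoff used for flagging is an arbitrary illustrative choice, not a
rule proposed in the manuscript.

```python
def summarize_errors(error_scores, n_items):
    """Return (mean error score, reproducibility) for a group of subjects."""
    mean_error = sum(error_scores) / len(error_scores)
    # Reproducibility is the complement of the average proportion of error.
    return mean_error, 1 - mean_error / n_items


def flag_atypical(error_scores, cutoff):
    """Indices of subjects whose error score is at or above `cutoff`."""
    return [i for i, e in enumerate(error_scores) if e >= cutoff]


# Hypothetical error scores for ten subjects; subject A is index 0.
error_scores = [6, 2, 2, 2, 2, 0, 0, 0, 0, 0]
mean_error, rep = summarize_errors(error_scores, n_items=6)
atypical = flag_atypical(error_scores, cutoff=4)  # cutoff is illustrative
```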

Example 2: Comparative Analysis of Normal and Learning Disabled Errors of
Prediction on a Hierarchical Test of Seriation by Sense Modality: Cognitive
Processing

In a recent unpublished (yet to be completed) study 18 ten year old
children from learning disability classes were compared to 74 children from
normal classes (seven to ten years of age) with regard to cognitive
processing used in conjunction with various sense modalities. Sixteen tasks
were devised such that the children were required to sort a group of objects
from smooth to rough (tactile), a second group from light to heavy
(kinesthetic), a third group from white to dark (visual), and a fourth group
of objects from short to long. Such tasks are properly referred to as
seriation tasks in the sense of Inhelder and Piaget (1964). After each
initial sorting the children were given three additional objects logically
associated with the sorted group and asked to insert these additional objects
into their proper positions within the sorted group. These tasks were all
logically equivalent but because of the degree of sense discrimination
required and the various sense modalities used they varied in difficulty.

The reproducibility of the 16 tasks for the 73 children from the normal
classrooms was .81. Similarly the reproducibility for the learning disabled
children for the same hierarchical test was .81! Clearly the tasks defined
a hierarchical test of modality seriation. Of primary importance were the
distributions of errors of prediction for the normal and learning disabled
children. If the distributions were significantly different from each other
this would suggest that the cognitive processing of the learning disabled
children was different from the cognitive processing of the normal children
used to normatively establish the hierarchical test.

On this test there was a possible maximum of 16 errors. The errors on
any hierarchical test will always occur in multiples of two, thus the range
of error pairs on this test was from 0 to 8. The error distributions for this
test are reported in Table 2. Eliminating the last column of Table 2, a chi
square test of independence was conducted to determine if the frequencies in
any of the paired error categories tended to occur with greater or less
probability for either the normal or learning disabled children. The
resulting chi square [χ²(4) = 3.70, p > .05] was not significant, thereby
suggesting that the frequency




Table 2. Score error frequency distributions from hierarchical test of
sense modality seriation.

¹Score errors always occur in multiples of two, thus the distribution is
described in multiples of two.

²Because 1 subject falls in this category it was omitted from analysis.
of errors of prediction occurred independently of the categories learning
disabled and normal.
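
A chi square test of independence of this kind can be sketched in plain
Python. The cell frequencies of Table 2 are not available here, so the
two-row, five-column table below is entirely hypothetical, and the function
name is an assumption. The only outside fact used is the tabled critical
value of chi square for 4 degrees of freedom at the .05 level, 9.488. The
sketch assumes no row or column total is zero.

```python
def chi_square_independence(table):
    """Pearson chi square statistic and degrees of freedom for a
    rows-by-columns table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


CRITICAL_05_DF4 = 9.488  # tabled critical value, df = 4, alpha = .05

# Hypothetical frequencies by paired error category (not the Table 2 data).
hypothetical = [
    [20, 20, 15, 10, 8],  # normal
    [5, 5, 4, 2, 1],      # learning disabled
]
statistic, df = chi_square_independence(hypothetical)
significant = statistic > CRITICAL_05_DF4
```

With two groups and five retained error categories the test has
(2 - 1)(5 - 1) = 4 degrees of freedom, which agrees with the value reported
in the text.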

It was concluded on the basis of these findings that the cognitive
processing of the two groups was the same. Note that reference was never made
to the number of correct responses on the test as this would have addressed
a different question, level of knowledge. Such a question is ideally
addressed by a hierarchical test but simply was not part of the study just
described. (Interestingly enough there was a difference with regard to level
of knowledge.)

The substantive implications of these findings are beyond the scope
of this paper; however, it should be clear from this example that one
particularly compelling use of a properly defined hierarchical test is the
comparative analysis of several well defined groups such as normal and
learning disabled children.

Example 3: Constructing a Normatively-Based Hierarchical Test for the Early
Identification of Learning Problems.

Assume that one has a number of items that purportedly represent what is
felt to be a content domain. Furthermore, one would like very much to have
these items define a test having the previously mentioned features, but there
is some question about which items do or do not really belong to the content
domain, or worse yet (Example 4), one thinks that he knows which items belong
to the domain and how they define the content domain, but this knowledge is
in error. Until very recently there seemed to be no way of knowing which
items did or did not belong and there seemed to be no way of knowing whether
or not one's a priori assumptions about the content domain were in error.

Recently a rather large school district developed an extensive pre-school
inventory (50 items). It was clear to them that the instrument was not a
single factor instrument. To use a score based upon the total number of
correct responses would have resulted in a test with low validity. Initially
one might assume that a factor analysis would aid in defining more valid
subtests. Unfortunately each item was either correct or incorrect, binary,
thus a factor analysis of the data would have resulted in what is typically
referred to as difficulty factors, e.g., easy items cluster together,
difficult items cluster together, extreme items cluster together and so on.
The items will cluster together not because of their content but because of
their difficulty, thus the subtests might or might not be more valid than the
total test. This is an especially severe problem as one of the major
objectives of the instrument was to serve as a screening device to identify
children with potential learning problems, i.e., early identification. When
properly developed such instruments use as a criterion some later measure of
learning. If it is known that there are validity problems initially it seems
senseless to conduct a longitudinal study.

The most logical approach seemed to be one of identifying hierarchical
subtests. To this end a new multivariate procedure has been developed,
Multiple Hierarchical Analysis (Hofmann, Note 1). The multivariate model
will not be discussed in this manuscript; let it suffice that the model
identifies latent Guttman scales in the data and then determines the best
real data approximations to these latent scales. The real data approximations
are just hierarchical tests! However, these hierarchical tests are not
composed of all of the items in the test battery; rather, those items that
appear not to belong to the content domain associated with the hierarchical
test are excluded. As a result several hierarchical subtests are derived from
the original test battery. These subtests may have certain items in common
and there may be certain items excluded from all of the subtests. Typically
the scores on one such hierarchical test will be correlated with the scores
on another hierarchical test derived from the same battery of items. This is
not a severe problem as the correlations tend not to be high. Thus it is
possible to take a group of logically homogeneous items and with the Multiple
Hierarchical Analysis model "cull-out" those items that are not part of the
hierarchical test domain, the remaining items forming a normatively-based
hierarchical test or a group of hierarchical subtests. Such cleaned up
hierarchical tests might properly be referred to as normatively-based
hierarchical tests. When such a test is determined from a single content area
it might be referred to as a normatively based criterion-referenced test.

Using the Multiple Hierarchical Analysis model in conjunction with the
responses of 1236 children ages 4.5 to approximately 6.0 to the 50 items,
ten hierarchical subtests were identified. Of considerable importance were
the first two hierarchical subtests, which were composed of 41 of the
original 50 items. The first subtest is composed of 31 items while the second
subtest is composed of an additional ten items not on the first subtest.

The 31 item hierarchical subtest has an astonishingly high Kuder-Richardson
20 reliability (Ferguson, 1971) of .97. The reproducibility of this subtest
is .88. The error of prediction distribution is reported in Table 3. The
second hierarchical subtest is somewhat of a disappointment with a
reliability of .484. When corrected to a reliability equivalent to a 31 item
test the reliability becomes .74, and a reproducibility of [illegible]. These
figures are not necessarily poor but relative to the 31 item hierarchical
test they leave much to be desired.
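
The length correction quoted above can be reproduced with the Spearman-Brown
formula, which the Table 6 footnote names as the correction used elsewhere in
this manuscript. The sketch below assumes that the same formula was applied
here; the function name and argument names are hypothetical.

```python
def spearman_brown(reliability, old_length, new_length):
    """Project a reliability to a test of a different length.

    n is the ratio of the new test length to the old test length."""
    n = new_length / old_length
    return n * reliability / (1 + (n - 1) * reliability)


# Ten item subtest with reliability .484, projected to 31 items.
corrected = spearman_brown(0.484, old_length=10, new_length=31)
rounded = round(corrected, 2)  # agrees with the .74 reported in the text
```

That the projected value matches the reported .74 suggests, but does not
prove, that this is the correction the author applied.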

Because the first subtest has such fine properties serious consideration
should be given to the additional testing or retesting of those children with






Table 3. Frequency of errors of prediction distribution for 1236 children
on 31 item and 10 item hierarchical subtests.
error scores of 6, 7 and 8. Clearly these errors suggest that their
response patterns are drastically different from the response patterns of
the 1216 children. Presently we are preparing to gather the results of the
first year of achievement for these 1236 children. Certainly a linear
regression is planned for use with the well behaved composite scores, which
range from a low of approximately 5 to a high of approximately 30. Also
being planned as an alternative to linear regression is an expectancy table
approach. Although the achievement data have not yet been obtained it may
be informative to illustrate how an expectancy table is developed as an
alternative to linear regression.

Assume as an independent variable the error of prediction score. As a
dependent variable one might consider ranges of achievement as opposed to
specific scores or subjective judgements of teachers. Assume that the
dependent variable is a teacher's subjective judgement of a child's
achievement. A linear regression approach would most likely utilize the
composite score on the test and a numerical index of achievement. The cell
entries are hypothetical but in practice they would represent the frequency
of children obtaining the particular error of prediction associated with the
row and the teacher rating associated with the column. Dividing any row entry
by a row total will define the probability of a child who obtained the row
error of prediction receiving the column rating.
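
The row-total conversion just described can be sketched as follows. All of
the frequencies, the rating labels, and the function name are hypothetical;
they are chosen only so that the zero-error row is spread evenly over the
five ratings and the high-error row is concentrated in the lowest rating, as
the discussion of the expectancy table suggests.

```python
# Hypothetical labels for five teacher rating categories.
RATINGS = ["poor", "below average", "average", "above average", "excellent"]


def expectancy_table(frequencies):
    """Convert each row of frequencies to probabilities based on row totals.

    `frequencies` maps an error of prediction score to a list of counts,
    one count per teacher rating category."""
    table = {}
    for error, row in frequencies.items():
        row_total = sum(row)
        if row_total == 0:
            table[error] = [0.0] * len(row)  # no children in this row
        else:
            table[error] = [count / row_total for count in row]
    return table


# Hypothetical frequencies: rows are error scores, columns follow RATINGS.
hypothetical_frequencies = {
    0: [40, 40, 40, 40, 40],
    2: [30, 35, 40, 35, 30],
    4: [20, 15, 10, 8, 5],
    6: [12, 5, 2, 1, 0],
}
probabilities = expectancy_table(hypothetical_frequencies)
p_poor_given_high_error = probabilities[6][RATINGS.index("poor")]
```

Reading across a row then gives, for a child with that error score, the
estimated probability of receiving each teacher rating.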

Implicit in this table is a major hypothesis of this paper, mainly that
children who are normatively atypical in their performance on a particular
cognitive test will be normatively atypical in their school performance or be
learning disabled. An error score of zero is only indicative of the lack of
confusion in an individual's composite score. Table 4 simply implies that a



Table 4. Illustrative example of an expectancy table.

child with an error score of zero has an equal probability of being
categorized into any one of the five teacher rating categories. This follows
logically as level of achievement should ordinarily predict teacher's ratings
and should occur independently of error prediction. (Strictly speaking
there will be fewer errors of prediction associated with very low and very
high composite scores.) If the total table were converted to probabilities
based on row totals, it would be found that the greatest probability of being
rated poor is associated with a high error of prediction. Alternatively,
within a linear regression framework these same 20 individuals would be the
ones for whom the greatest errors of linear prediction would occur.

In time it is hoped that the adequacy of this prediction model will be
established. Clearly the accuracy of this model from a learning disability
framework is dependent upon the content domain of the test and its logical
relationship to achievement.

Example 4: Can We Make A Silk Purse From a Sow's Ear?



As previously noted the use of a composite score implicitly assumes a
hierarchical test. How well is this assumption met with real-life
standardized data? A subtest of a prominent American standardized test was
evaluated with regard to certain properties of a hierarchical test.



Utilizing the response patterns of 83 second grade children (a total
population from one school), five of whom were labeled as learning disabled,
a reproducibility of .72 was obtained for the 35 items. The consequences of
such a low reproducibility are best characterized by the error frequencies
in Table 5.



Table 5. Frequencies of errors of prediction on a prominent 35 item
standardized subtest.
Had one attempted to predict the response patterns for these children,
at least 9 of the children would have missed seven items that were predicted
as being correct for them and they would have responded correctly to seven
items that were predicted as being incorrect for them, 14 errors for each of
the nine children. In the previous example only three percent of the sample
had 12 or more errors of prediction whereas on this subtest 35 percent had
12 or more errors of prediction, yet this instrument has only four more items.
On the average it is possible to predict 72 percent of any child's response
pattern. The degree of composite confusion is immense on this particular
instrument. Although the validity must be low for an instrument with such
great composite confusion, ironically the Kuder-Richardson 20 reliability
estimate for this instrument is .78.

The publishers claim that this subtest measures four different components.
The component subtests were analyzed with the following reproducibilities:
[illegible], .78, .79 and .73, with corrected reliabilities of .80, .71, .80
and .78 respectively. These subtests show little improvement over the
original subtest as the two largest reproducibilities are associated with
subtests of seven and five items respectively.

In an attempt to identify latent scales the Multiple Hierarchical Analysis
model was applied to the data. The analysis defined 13 hierarchical subtests
with ten subtests showing a greater reproducibility than the four subtests
defined by the publishers. The subscale item content ranged from a low of
two to a high of seven items. The subscales are summarized in Table 6.
Although one must be skeptical regarding the use of a two or three item
subtest it does not seem at all unreasonable to use a 6, 7, 8 or 9 item
subtest, especially given the large percentage of low errors of prediction.
Table 6. Summary table for the 13 normatively-based subtests determined from
a prominent American standardized subtest.

¹Dashes are used when the number of errors is not possible.

²REP refers to subtest reproducibility.

³Normalized reliability refers to reliabilities corrected by the
Spearman-Brown prophecy formula (Ferguson, 1971) to a magnitude that would be
associated with a 35 item test.



When utilizing subtests composed of a restricted number of items one must
keep in mind the consequences of an associated restricted variance for the
composite scores if they are used in a linear regression or any parametric
analysis. Although it has not been mentioned thus far because it has not
been a problem in the examples, it is possible that regardless of the number
of items in a subscale there may be a restricted variance with the composite
scores if the subtest items are homogeneous with regard to difficulty.
Contrary to much traditional psychometric literature it is desirable to have
heterogeneous item difficulty on a test if it is to have the properties of a
hierarchical test.

After all of the efforts to obtain the subtests, three of the learning
disabled children were not, but two were, associated with extreme errors of
prediction on certain subtests. Most likely there were not enough learning
disabled children identified in the sample to allow a reasonable analysis of
the errors of prediction.

Finally, in response to the subtitle of this section: maybe, but it
will require effort.

Summary

In this manuscript a new type of measurement model was discussed, the
hierarchical test. Unlike traditional tests which result in a composite
score, the hierarchical test was shown to have two associated scores: a
composite score and an error of prediction score. Utilizing an artificial
data set as a first illustrative example most of the basic characteristics of
a hierarchical test were identified and their computations were discussed
verbally. Three additional real-life examples were presented. Under the
assumption that most researchers have a working knowledge of composite scores
and their research uses the discussion of such scores was minimal. The
discussion within the illustrative examples emphasized the importance and use
of the error of prediction scores. The examples demonstrated: the use of a
hierarchical test in the comparative analysis of certain cognitive processing
of learning disabled and normal children; a method of establishing a
hierarchical test for the early identification of children with learning
problems; how one might go about testing a standardized test for the
properties of a hierarchical test.

Space does not permit the extensive use of illustrations; however, there
are several additional uses of hierarchical tests worthy of brief mention.
The items of such a test might specify a learning hierarchy in the sense of
Gagné, facilitating the assessment of an individual's position within the
specified hierarchy. In specifying learning hierarchies the items of such
an instrument would also allow one to utilize chaining concepts, establishing
item level empirical prerequisites for learning within the test domain,
possibly facilitating an empirically based aptitude interaction model.
Alternatively the items of such a test would facilitate the advancement of
the state of knowledge with regard to task analysis. In addition to all of
this the composite score of a hierarchical test may be interpreted with
considerable validity within a traditional normative framework.

Finally it is possible that the composite scores of a hierarchical
test would be predictive, perhaps using expectancy tables or linear
regressions, of levels of achievement, while error scores might be predictive
of specific categories of learning disabilities depending upon the content
domain. The hierarchical test approach to the identification of learning
disabilities may provide a means to better classification, better
understanding and improved program development for learning disabled
children. Finally the [...] increased understanding of the [...] processing
characteristics of various types of learning [...] providing information for
the planning of individual education programs. Testing of the model is just
beginning.

Reference Note

1. Hofmann, R. J. [title illegible]. Manuscript in preparation.

References

Ferguson, G. A. Statistical analysis in psychology and education. New York:
McGraw-Hill, 1971.

Guttman, L. The basis for scalogram analysis. In S. Stouffer, et al.,
Measurement and prediction. Princeton: Princeton University Press, 1950.

Inhelder, B. and Piaget, J. The early growth of logic in the child. New
York: W. W. Norton and Company, Inc., 1964.



